7 research outputs found

    Differentiable sampling of molecular geometries with uncertainty-based adversarial attacks

    Neural network (NN) interatomic potentials provide fast prediction of potential energy surfaces, closely matching the accuracy of the electronic structure methods used to produce the training data. However, NN predictions are only reliable within well-learned training domains, and show volatile behavior when extrapolating. Uncertainty quantification approaches can flag atomic configurations for which prediction confidence is low, but arriving at such uncertain regions requires expensive sampling of the NN phase space, often using atomistic simulations. Here, we exploit automatic differentiation to drive atomistic systems towards high-likelihood, high-uncertainty configurations without the need for molecular dynamics simulations. By performing adversarial attacks on an uncertainty metric, informative geometries that expand the training domain of NNs are sampled. When combined with an active learning loop, this approach bootstraps and improves NN potentials while decreasing the number of calls to the ground truth method. This efficiency is demonstrated on sampling of kinetic barriers and collective variables in molecules, and can be extended to any NN potential architecture and materials system. Comment: 12 pages, 4 figures, supporting information

    Representations of Materials for Machine Learning

    High-throughput data generation methods and machine learning (ML) algorithms have given rise to a new era of computational materials science by learning relationships among composition, structure, and properties and by exploiting such relations for design. However, to build these connections, materials data must be translated into a numerical form, called a representation, that can be processed by a machine learning model. Datasets in materials science vary in format (ranging from images to spectra), size, and fidelity. Predictive models vary in scope and properties of interest. Here, we review context-dependent strategies for constructing representations that enable the use of materials as inputs or outputs of machine learning models. Furthermore, we discuss how modern ML techniques can learn representations from data and transfer chemical and physical information between tasks. Finally, we outline high-impact questions that have not been fully resolved and thus require further investigation. Comment: 20 pages, 5 figures, To Appear in Annual Review of Materials Research 5

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed

    Differentiable sampling of molecular geometries with uncertainty-based adversarial attacks

    Neural network (NN) interatomic potentials provide fast prediction of potential energy surfaces, closely matching the accuracy of the electronic structure methods used to produce the training data. However, NN predictions are only reliable within well-learned training domains, and show volatile behavior when extrapolating. Uncertainty quantification methods can flag atomic configurations for which prediction confidence is low, but arriving at such uncertain regions requires expensive sampling of the NN phase space, often using atomistic simulations. Here, we exploit automatic differentiation to drive atomistic systems towards high-likelihood, high-uncertainty configurations without the need for molecular dynamics simulations. By performing adversarial attacks on an uncertainty metric, informative geometries that expand the training domain of NNs are sampled. When combined with an active learning loop, this approach bootstraps and improves NN potentials while decreasing the number of calls to the ground truth method. This efficiency is demonstrated on sampling of kinetic barriers, collective variables in molecules, and supramolecular chemistry in zeolite-molecule interactions, and can be extended to any NN potential architecture and materials system.

    Empagliflozin in Patients with Chronic Kidney Disease

    Background: The effects of empagliflozin in patients with chronic kidney disease who are at risk for disease progression are not well understood. The EMPA-KIDNEY trial was designed to assess the effects of treatment with empagliflozin in a broad range of such patients. Methods: We enrolled patients with chronic kidney disease who had an estimated glomerular filtration rate (eGFR) of at least 20 but less than 45 ml per minute per 1.73 m² of body-surface area, or who had an eGFR of at least 45 but less than 90 ml per minute per 1.73 m² with a urinary albumin-to-creatinine ratio (with albumin measured in milligrams and creatinine measured in grams) of at least 200. Patients were randomly assigned to receive empagliflozin (10 mg once daily) or matching placebo. The primary outcome was a composite of progression of kidney disease (defined as end-stage kidney disease, a sustained decrease in eGFR to <10 ml per minute per 1.73 m², a sustained decrease in eGFR of ≥40% from baseline, or death from renal causes) or death from cardiovascular causes. Results: A total of 6609 patients underwent randomization. During a median of 2.0 years of follow-up, progression of kidney disease or death from cardiovascular causes occurred in 432 of 3304 patients (13.1%) in the empagliflozin group and in 558 of 3305 patients (16.9%) in the placebo group (hazard ratio, 0.72; 95% confidence interval [CI], 0.64 to 0.82; P<0.001). Results were consistent among patients with or without diabetes and across subgroups defined according to eGFR ranges. The rate of hospitalization from any cause was lower in the empagliflozin group than in the placebo group (hazard ratio, 0.86; 95% CI, 0.78 to 0.95; P=0.003), but there were no significant between-group differences with respect to the composite outcome of hospitalization for heart failure or death from cardiovascular causes (which occurred in 4.0% in the empagliflozin group and 4.6% in the placebo group) or death from any cause (in 4.5% and 5.1%, respectively). The rates of serious adverse events were similar in the two groups. Conclusions: Among a wide range of patients with chronic kidney disease who were at risk for disease progression, empagliflozin therapy led to a lower risk of progression of kidney disease or death from cardiovascular causes than placebo.